\section{Inverted Index}
\label{sec:iv}
The inverted index records which terms appear in which documents. A term is a word that is not a stop word. We have a data type (class) called \textit{Term}, which contains the term itself and a list of \textit{KeyValuePairs}. Both the key and the value of each \textit{KeyValuePair} are integers: the key is a document ID, and the value is the number of times the term occurs in that document. The index is built dynamically at runtime and is finally sorted by term. The terms are extracted from each document as follows:
\begin{itemize}
	\item Tokenize the contents of the document (see Section~\ref{sec:tok})
	\item Remove stop words and stem the tokens
	\item Add the stemmed tokens and their count together with the document ID to the inverted index
\end{itemize}
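
The steps above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's actual implementation: the names \texttt{tokenize}, \texttt{stem}, \texttt{build\_inverted\_index} and \texttt{STOP\_WORDS} are hypothetical, the stop word list is a tiny sample, and the stemmer is a naive placeholder that only strips a trailing ``s''. Each term maps to a list of (document ID, count) pairs, which plays the role of the \textit{KeyValuePairs} held by a \textit{Term}.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop word list (assumption for illustration).
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the"}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (stand-in tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Placeholder stemmer: strips one trailing 's' from longer tokens.

    A real system would use a proper stemming algorithm instead.
    """
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def build_inverted_index(documents):
    """Map each term to a list of (document ID, count) pairs.

    `documents` maps integer document IDs to their text. The returned
    dict is ordered by term, mirroring the final sort of the index.
    """
    postings = defaultdict(list)
    for doc_id in sorted(documents):
        # Step 1: tokenize. Step 2: drop stop words, then stem.
        tokens = [stem(t) for t in tokenize(documents[doc_id])
                  if t not in STOP_WORDS]
        # Step 3: add each term with its count and the document ID.
        for term, count in Counter(tokens).items():
            postings[term].append((doc_id, count))
    return dict(sorted(postings.items()))


if __name__ == "__main__":
    docs = {1: "The cats chase the cat", 2: "A dog chases cats"}
    for term, pairs in build_inverted_index(docs).items():
        print(term, pairs)
```

Because the documents are processed in order of their IDs, the pairs for each term are also ordered by document ID; whether the actual index guarantees this ordering is an assumption of the sketch.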

\ind{jaccardSimilarity}